{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Custom Classes and Hashing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We know that in order for an object to be usable as a key in a dictionary, it must be hashable.\n",
    "In general Python will not allow mutable types to be hashable. I explained why in previous lectures, but it boils down to key retrieval. \n",
    "\n",
    "To retrieve a key/value from a dictionary, we start with the hash of the key, mod (`%`) the size of the dictionary (allocated, not in-use). From that a sequence of search indices is generated (the probe sequence). Python then follows this probe sequence one by one, comparing the requested key with the key at that index, using `==` comparisons (technically it first compares the hasesh themselves, and f they are equal then also compares the keys). If it finds a key which compares equal then it returns that item, otherwise it continues the probe sequence until it either finds the key or sees an empty slot (which means the key does not exist in the dictionary) and bails out of the search.\n",
    "\n",
    "If we allowed the key to change, then even if it had the same hash (and hence the same probe sequence), Python would not find it unless it still compared equal.\n",
    "\n",
    "So technically it is not required that the key be immutable, what is required is that the hash and equality of the key does not change!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember the difference between equality (`=`) and identity (`is`):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "t1 = (1, 2, 3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "t2 = (1, 2, 3)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t1 is t2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "t1 == t2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "d = {t1: 100}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "100"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[t1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "100"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "d[t2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, even though `t1` and `t2` are different **objects**, we can still retrieve the element from the dictionary using either one - because they compare **equal** to each other, and, in fact, **have the same hash** as well:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2528502973977326415, 2528502973977326415)"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash(t1), hash(t2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "One of the basic premises of hashes is that if two objects compare equal, they must have the same hash."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "What happens when we create custom objects? Are these hashable?\n",
    "The answer is yes - but our objects could be mutable, how does Python create a hash for these objects then?\n",
    "It uses the memory address (`id`) of the object to compute a hash.\n",
    "\n",
    "Also, by default, different instances of a custom class instances will never compare equal, since by default it compares the memory address."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "p1 = Person('John', 78)\n",
    "p2 = Person('John', 78)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4359101128, 4359101072)"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "id(p1), id(p2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p1 == p2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(-9223372036582331988, 272443817)"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash(p1), hash(p2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because of this default hash calculation, we can actually use custom objects as keys in dictionaries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "p1 = Person('John', 78)\n",
    "p2 = Person('Eric', 75)\n",
    "persons = {p1: 'John object', p2: 'Eric object'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Person(name=John, age=78)\n",
      "Person(name=Eric, age=75)\n"
     ]
    }
   ],
   "source": [
    "for k in persons.keys():\n",
    "    print(k)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The problem here is that the **only** way to retrieve John for example, is to request the **original** object as the key (since any other instance, even with the same attribute values would not be equal):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'John object'"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons[p1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But we cannot retrieve it this way:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Person(name=John, age=78) 4359141472\n",
      "Person(name=John, age=78) 4359139736\n"
     ]
    }
   ],
   "source": [
    "p = Person('John', 78)\n",
    "print(p, id(p))\n",
    "print(p1, id(p1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see they are not the **same** object, they do not compare equal, and their hash is not the same:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(False, 272446342, -9223372036582329575)"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p == p1, hash(p), hash(p1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And so:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'not found'"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons.get(p, 'not found')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This may not be the behavior we want - we might want to be able to retrieve John from the dictionary as long as the contents (or some of the contents) matches - i.e. when do we consider two Person instances **equal**.\n",
    "\n",
    "To do this we would start by implementing an `__eq__` method in our class:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Person):\n",
    "            return self.name == other.name and self.age == other.age\n",
    "        else:\n",
    "            return False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [],
   "source": [
    "p1 = Person('John', 78)\n",
    "p2 = Person('John', 78)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p1 == p2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "OK, that's great, so let's put `p1` in a dictionary and see if we can recover it using `p2`, which evaluates to equal to `p1`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "unhashable type: 'Person'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-23-9fb6d89f8843>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpersons\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m{\u001b[0m\u001b[0mp1\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;34m'John p1'\u001b[0m\u001b[0;34m}\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'Person'"
     ]
    }
   ],
   "source": [
    "persons = {p1: 'John p1'}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Huh? Why is a Person instance suddenly unhashable?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "unhashable type: 'Person'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-25-1d5254fcc026>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mhash\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mp1\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'Person'"
     ]
    }
   ],
   "source": [
    "hash(p1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The only thing we changed is we implemented the `__eq__` method. Let's just check:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "272445213"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash(Person('John', 78))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Person):\n",
    "            return self.name == other.name and self.age == other.age\n",
    "        else:\n",
    "            return False"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "unhashable type: 'Person'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-29-31caaf6017f1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mhash\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPerson\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'John'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m78\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'Person'"
     ]
    }
   ],
   "source": [
    "hash(Person('John', 78))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Yes, that's the reason... But why?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember what I said earlier, if two objects compare equal (`==`) then their hash should also compare equal.\n",
    "\n",
    "`p1` and `p2` are distinct objects, but they now compare equal, and if their hash was based on their `id` they would not have equal hashes!\n",
    "\n",
    "When we implement an `__eq__` method on a class, Python will no longer provide a default hash. Instead it automatically indicates that the class is not hashable.\n",
    "\n",
    "There is a special method `__hash__` which is used by Python when we call the `hash()` function. If that `__hash__` method **is** `None` then Python considers the object unhashable (note I am not saying the `__hash__` function returns `None`, I am saying it should just **be** `None`)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "None\n"
     ]
    }
   ],
   "source": [
    "hash_func = Person.__hash__\n",
    "print(hash_func)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Notice how the __hash__ attribute is `None` - it is not a function that returns `None`."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In fact, we could have done this explicitly ourselves as well:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Person):\n",
    "            return self.name == other.name and self.age == other.age\n",
    "        else:\n",
    "            return False\n",
    "    \n",
    "    __hash__ = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "unhashable type: 'Person'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-32-31caaf6017f1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mhash\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPerson\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'John'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m78\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'Person'"
     ]
    }
   ],
   "source": [
    "hash(Person('John', 78))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In fact we can use this technique to mark a custom class, even if it does not implement an `__eq__` method as unhashable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    __hash__ = None"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "unhashable type: 'Person'",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-34-31caaf6017f1>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mhash\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mPerson\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m'John'\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m78\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m: unhashable type: 'Person'"
     ]
    }
   ],
   "source": [
    "hash(Person('John', 78))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case though, we do want Person instances to be hashable so we can recover Person keys in our dictionary based on whether the objects compare equal or not.\n",
    "In this case we simply want to create a hash based on `name` and `age`. Since both of these values are themselves hashable it turns out to be pretty easy to do:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Person):\n",
    "            return self.name == other.name and self.age == other.age\n",
    "        else:\n",
    "            return False\n",
    "    \n",
    "    def __hash__(self):\n",
    "        print('__hash__ called...')\n",
    "        return hash((self.name, self.age))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False\n",
      "True\n",
      "__hash__ called...\n",
      "__hash__ called...\n",
      "True\n"
     ]
    }
   ],
   "source": [
    "p1 = Person('John', 78)\n",
    "p2 = Person('John', 78)\n",
    "print(id(p1) is id(p2))\n",
    "print(p1 == p2)\n",
    "print(hash(p1) == hash(p2))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, `Person` objects are now hashable, and equal objects have equal hashes. Of course, if the objects are not equal they usually will have different hashes (though that is not mandatory - we'll come back to that in a bit)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [],
   "source": [
    "p3 = Person('Eric', 75)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "False\n",
      "__hash__ called...\n",
      "__hash__ called...\n",
      "False\n"
     ]
    }
   ],
   "source": [
    "print(p1 == p3)\n",
    "print(hash(p1) == hash(p3))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's just remove that print statement quick:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Person):\n",
    "            return self.name == other.name and self.age == other.age\n",
    "        else:\n",
    "            return False\n",
    "    \n",
    "    def __hash__(self):\n",
    "        return hash((self.name, self.age))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's see how this works with dictionaries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [],
   "source": [
    "p1 = Person('John', 78)\n",
    "p2 = Person('John', 78)\n",
    "p3 = Person('Eric', 75)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [],
   "source": [
    "persons = {p1: 'first John object'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'first John object'"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons[p1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'first John object'"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons[p2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "Person(name=Eric, age=75)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-44-cf90355be1b6>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpersons\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mp3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m: Person(name=Eric, age=75)"
     ]
    }
   ],
   "source": [
    "persons[p3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now let's try to add `p2` to the dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [],
   "source": [
    "persons[p2] = 'other (equal) John object'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{Person(name=John, age=78): 'other (equal) John object'}"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see, we actually just overwrote the value of that key - since those two keys are in fact equal (`==`)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So we could not do this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "persons = {p1: 'p1', p2: 'p2'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{Person(name=John, age=78): 'p2'}"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see the key was considered the same, and hence the last value assignment was effective."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But of course we could do this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {},
   "outputs": [],
   "source": [
    "persons = {p1: 'p1', p3: 'p3'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{Person(name=John, age=78): 'p1', Person(name=Eric, age=75): 'p3'}"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "since `p1` and `p3` are not equal (`==`)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### A subtle point about ` __hash__` and `hash()`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `__hash__` method must return an integer - Python will complain otherwise:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Test:\n",
    "    def __hash__(self):\n",
    "        return 'a string'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "ename": "TypeError",
     "evalue": "__hash__ method should return an integer",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m                                 Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-52-335ee67029c9>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mhash\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mTest\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mTypeError\u001b[0m: __hash__ method should return an integer"
     ]
    }
   ],
   "source": [
    "hash(Test())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just out of interest:\n",
    "\n",
    "When we call the `hash()` function, although it in turn calls the `__hash__` method, it does something more.\n",
    "\n",
    "It will truncate the integer returned by `__hash__` to a certain width which is implementation dependent.\n",
    "\n",
    "In my case, I can see that hashes will be truncated to 64-bits:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "64"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import sys\n",
    "sys.hash_info.width"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's just see how that affects the results of our `__hash__` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Test:\n",
    "    def __hash__(self):\n",
    "        return 1_000_000_000_000_000_000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1000000000000000000"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash(Test())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Test:\n",
    "    def __hash__(self):\n",
    "        return 10_000_000_000_000_000_000"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "776627963145224196"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash(Test())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "mod = sys.hash_info.modulus"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "2305843009213693951"
      ]
     },
     "execution_count": 59,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mod"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "776627963145224196"
      ]
     },
     "execution_count": 60,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "10_000_000_000_000_000_000 % mod"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Back to equal hashes for unequal objects"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we have seen many times now, hash functions and hashable objects need to satisfy these conditions:\n",
    "1. if a == b then hash(a) == hash(b)\n",
    "2. hash(a) must be an integer\n",
    "\n",
    "But nothing specifies here that unequal objects must result in unequal hashes.\n",
    "\n",
    "The only issue with equal hashes with unequal objects is that we end up getting more collisions when looking up a key in a dictionary (refer to the earlier theory section if you want more details on this)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, let's try it out with our `Person` class, we are going to implement a hash that is going to be a constant integer. That will still satisfy conditions (1) and (2) above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Person:\n",
    "    def __init__(self, name, age):\n",
    "        self.name = name\n",
    "        self.age = age\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'Person(name={self.name}, age={self.age})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Person):\n",
    "            return self.name == other.name and self.age == other.age\n",
    "        else:\n",
    "            return False\n",
    "    \n",
    "    def __hash__(self):\n",
    "        return 100"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {},
   "outputs": [],
   "source": [
    "p1 = Person('John', 78)\n",
    "p2 = Person('Eric', 75)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(100, 100)"
      ]
     },
     "execution_count": 63,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "hash(p1), hash(p2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "False"
      ]
     },
     "execution_count": 64,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "p1 == p2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {},
   "outputs": [],
   "source": [
    "persons = {p1: 'p1', p2: 'p2'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{Person(name=John, age=78): 'p1', Person(name=Eric, age=75): 'p2'}"
      ]
     },
     "execution_count": 66,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'p1'"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons[p1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'p2'"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons[p2]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'p1'"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "persons[Person('John', 78)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see that still works just fine.\n",
    "But let's see how performance is affected by this.\n",
    "To test this we are going to create a slightly simpler class:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Number:\n",
    "    def __init__(self, x):\n",
    "        self.x = x\n",
    "        \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Number):\n",
    "            return self.x == other.x\n",
    "        else:\n",
    "            return False\n",
    "    \n",
    "    def __hash__(self):\n",
    "        return hash(self.x)        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {},
   "outputs": [],
   "source": [
    "class SameHash:\n",
    "    def __init__(self, x):\n",
    "        self.x = x\n",
    "        \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, SameHash):\n",
    "            return self.x == other.x\n",
    "        else:\n",
    "            return False\n",
    "    \n",
    "    def __hash__(self):\n",
    "        return 100   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
    "numbers = {Number(i): 'some value' for i in range(1_000)}\n",
    "same_hashes = {SameHash(i): 'some value' for i in range(1_000)}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'some value'"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers[Number(500)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'some value'"
      ]
     },
     "execution_count": 74,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "same_hashes[SameHash(500)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And now let's time how long it takes to retrieve an element from each of those dictionaries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [],
   "source": [
    "from timeit import timeit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.008118819037918001\n"
     ]
    }
   ],
   "source": [
    "print(timeit('numbers[Number(500)]', globals=globals(), number=10_000))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.0041481230291538\n"
     ]
    }
   ],
   "source": [
    "print(timeit('same_hashes[SameHash(500)]', globals=globals(), number=10_000))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see it takes substantially longer (by a factor of more than 100x) to look up a value when we have hash collisions.\n",
    "In fact this is the reason why Python has randomized hashes for strings, dates, and a few other built in types. If these hashes were predictable it would be easy for an attacker to purposefully provide keys with the same hash to slow down the system in a denial of service attack."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So, even though that constant value we provide for a hash is technically valid, I wouldn't recommend you use something like it!!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's take a look at another practical example of where we might want to use custom hashing.\n",
    "\n",
    "Let's say we want to write a custom class to handle 2D coordinates:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Point:\n",
    "    def __init__(self, x, y):\n",
    "        self.x = x\n",
    "        self.y = y\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'({self.x}, {self.y})'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1, 2)\n"
     ]
    }
   ],
   "source": [
    "pt = Point(1, 2)\n",
    "print(pt)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this case, we actually would like to be able to put these points as keys in a dictionary.\n",
    "We certainly can as it is:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [],
   "source": [
    "points = {Point(0,0): 'pt 1', Point(1,1): 'pt 2'}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But how do we recover the value for the point (0,0) for example?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "(0, 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-81-b4f2758f9633>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpoints\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mPoint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m: (0, 0)"
     ]
    }
   ],
   "source": [
    "points[Point(0,0)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The problem of course is that Python is using a hash of the id of the points - so we need to implement a custom hash mechanism, and of course also the `__eq__` method (just because the hash of two objects is the same does not mean the objects are also equal, so to look up a key in a dictionary Python needs both a hash and equality)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Point:\n",
    "    def __init__(self, x, y):\n",
    "        self.x = x\n",
    "        self.y = y\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'({self.x}, {self.y})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, Point):\n",
    "            return self.x == other.x and self.y == other.y\n",
    "        else:\n",
    "            return False\n",
    "        \n",
    "    def __hash__(self):\n",
    "        return hash((self.x, self.y))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [],
   "source": [
    "points = {Point(0, 0): 'origin', Point(1,1): 'pt at (1,1)'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 84,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'origin'"
      ]
     },
     "execution_count": 84,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "points[Point(0,0)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As you can see we now have the desired functionality.\n",
    "\n",
    "Let's actually take this a step further, and implement things in such a way that we could use a regular 2-element tuple to look up a point in the dictionary.\n",
    "\n",
    "To do this we'll have to make sure that `(x, y) == Point(x, y)` and of course make sure that in that case we also have equal hashes - but since we are already calculating the hash of a Point as the hash of the corresponding tuple, we're already fine there."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Point:\n",
    "    def __init__(self, x, y):\n",
    "        self.x = x\n",
    "        self.y = y\n",
    "        \n",
    "    def __repr__(self):\n",
    "        return f'({self.x}, {self.y})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, tuple) and len(other) == 2:\n",
    "            other = Point(*other)\n",
    "        if isinstance(other, Point):\n",
    "            return self.x == other.x and self.y == other.y\n",
    "        else:\n",
    "            return False\n",
    "        \n",
    "    def __hash__(self):\n",
    "        return hash((self.x, self.y))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [],
   "source": [
    "points = {Point(0,0): 'origin', Point(1,1): 'pt at (1,1)'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'origin'"
      ]
     },
     "execution_count": 87,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "points[Point(0,0)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'origin'"
      ]
     },
     "execution_count": 88,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "points[(0,0)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In fact:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 89,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "(0,0) == Point(0,0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You'll notice that our `Point` class is technically mutable.\n",
    "So we could do something like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [],
   "source": [
    "pt1 = Point(0,0)\n",
    "pt2 = Point(1,1)\n",
    "points = {pt1: 'origin', pt2: 'pt at (1,1)'}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "('origin', 'origin', 'origin')"
      ]
     },
     "execution_count": 91,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "points[pt1], points[Point(0,0)], points[(0,0)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But what happens if we mutate `pt1`?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 92,
   "metadata": {},
   "outputs": [],
   "source": [
    "pt1.x = 10"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 93,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(10, 0)"
      ]
     },
     "execution_count": 93,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pt1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 94,
   "metadata": {},
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "(10, 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-94-d0f59386f0f0>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpoints\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mpt1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m: (10, 0)"
     ]
    }
   ],
   "source": [
    "points[pt1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So we can't recover our item using `pt1`, that's because the hash of `pt1` has changed, so Python start looking in the wrong place in the dictionary.\n",
    "\n",
    "Let's see what the items are in the dictionary:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(10, 0) origin\n",
      "(1, 1) pt at (1,1)\n"
     ]
    }
   ],
   "source": [
    "for k, v in points.items():\n",
    "    print(k, v)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So can we recover that 'origin' point using a different key maybe?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 96,
   "metadata": {},
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "(10, 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-96-905b9b72ee35>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpoints\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mPoint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m: (10, 0)"
     ]
    }
   ],
   "source": [
    "points[Point(10, 0)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Also not, again because the hash under which the original point `pt1` was stored, is not the same as the new hash for that same object.\n",
    "\n",
    "This is why we should not use mutable keys in a dictionary!\n",
    "\n",
    "So, in this case, although we cannot technically enfore immutability, we can use conventions to indicate the object is supposed to be immutable:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 97,
   "metadata": {},
   "outputs": [],
   "source": [
    "class Point:\n",
    "    def __init__(self, x, y):\n",
    "        self._x = x\n",
    "        self._y = y\n",
    "    \n",
    "    @property\n",
    "    def x(self):\n",
    "        return self._x\n",
    "    \n",
    "    @property\n",
    "    def y(self):\n",
    "        return self._y\n",
    "    \n",
    "    def __repr__(self):\n",
    "        return f'({self.x}, {self.y})'\n",
    "    \n",
    "    def __eq__(self, other):\n",
    "        if isinstance(other, tuple) and len(other) == 2:\n",
    "            other = Point(*other)\n",
    "        if isinstance(other, Point):\n",
    "            return self.x == other.x and self.y == other.y\n",
    "        else:\n",
    "            return False\n",
    "        \n",
    "    def __hash__(self):\n",
    "        return hash((self.x, self.y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Everything works just as before, but making the underlying attributes `_x` and `_y` indicates these are private and should not be modified directly.\n",
    "Furthermore we only created attribute getters, not setters for `x` and `y`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 98,
   "metadata": {},
   "outputs": [],
   "source": [
    "pt = Point(0,0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 99,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "execution_count": 99,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pt.x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 100,
   "metadata": {},
   "outputs": [
    {
     "ename": "AttributeError",
     "evalue": "can't set attribute",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m                            Traceback (most recent call last)",
      "\u001b[0;32m<ipython-input-100-634eaafb5eab>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mpt\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mx\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m10\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
      "\u001b[0;31mAttributeError\u001b[0m: can't set attribute"
     ]
    }
   ],
   "source": [
    "pt.x = 10"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
